13 research outputs found

    Data Provenance and Management in Radio Astronomy: A Stream Computing Approach

    New approaches to data provenance and data management (DPDM) are required for mega-science projects like the Square Kilometer Array, which are characterized by extremely large data volumes and intense data rates and therefore demand innovative and highly efficient computational paradigms. In this context, we explore a stream-computing approach with an emphasis on the use of accelerators. In particular, we make use of a new generation of high-performance stream-based parallelization middleware known as InfoSphere Streams, and we demonstrate its viability for managing signal-processing data pipelines in radio astronomy while ensuring their interoperability and integrity. IBM InfoSphere Streams embraces the stream-computing paradigm: a shift from conventional data-mining techniques (the analysis of existing data in databases) towards real-time analytic processing. We discuss using InfoSphere Streams for effective DPDM in radio astronomy and propose a way in which it can be utilized for large antenna arrays. We present a case study, an InfoSphere Streams implementation of an autocorrelating spectrometer, and use this example to discuss the advantages of the stream-computing approach and the utilization of hardware accelerators.
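    The case study centres on an autocorrelating spectrometer implemented as a streaming pipeline. The paper's actual InfoSphere Streams (SPL) operators are not reproduced here; the sketch below is a plain Python/NumPy illustration of the underlying computation only: accumulate autocorrelations over incoming sample blocks and form an averaged power spectrum via the Wiener-Khinchin relation. All names (process_stream, n_lags, block sizes) are illustrative assumptions.

        import numpy as np

        def autocorrelation(block, n_lags):
            """Biased autocorrelation estimate of one sample block for lags 0..n_lags-1."""
            n = len(block)
            return np.array([np.dot(block[:n - k], block[k:]) / n for k in range(n_lags)])

        def spectrum_from_acf(acf):
            """Power spectrum via the Wiener-Khinchin theorem (FFT of the symmetrized ACF)."""
            sym = np.concatenate([acf, acf[-2:0:-1]])
            return np.abs(np.fft.rfft(sym))

        def process_stream(stream_blocks, n_lags=256):
            """Accumulate per-block ACFs from the stream, then form an averaged spectrum."""
            acc, count = np.zeros(n_lags), 0
            for block in stream_blocks:
                acc += autocorrelation(block, n_lags)
                count += 1
            return spectrum_from_acf(acc / count)

        # Example: white noise in, roughly flat spectrum out.
        rng = np.random.default_rng(0)
        spec = process_stream(rng.standard_normal(4096) for _ in range(16))

    In a streaming middleware such as InfoSphere Streams, each of these functions would map onto an operator in the pipeline, which is also where hardware accelerators can be attached to the compute-heavy stages.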

    A Model Selection Criterion for Classification: Application to HMM Topology Optimization

    No full text
    This paper proposes a model selection criterion for classification problems. The criterion focuses on selecting models that are discriminant, rather than models based on the Occam's razor principle of parsimony between modeling accuracy and complexity. The criterion, dubbed the Discriminative Information Criterion (DIC), is applied to the optimization of Hidden Markov Model topology for the recognition of cursively handwritten digits. The results show that DIC-generated models achieve an 18% relative improvement in performance over a baseline system generated by the Bayesian Information Criterion (BIC).
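    The DIC formula itself is not given in this abstract, so it is not reproduced here. As a hedged sketch of criterion-driven topology selection, the snippet below scores candidate HMM topologies with the BIC baseline mentioned above; score_fn is the hook where a discriminative criterion such as DIC would be substituted. The model interface (.fit, .log_likelihood, .n_free_params) is an assumption standing in for whatever HMM toolkit is actually used.

        import math

        def bic(log_likelihood, n_free_params, n_samples):
            """Bayesian Information Criterion; higher is better in this sign convention."""
            return log_likelihood - 0.5 * n_free_params * math.log(n_samples)

        def select_topology(candidate_models, data, score_fn=bic):
            """Fit each candidate topology and keep the one with the best criterion value."""
            best, best_score = None, float("-inf")
            for model in candidate_models:
                model.fit(data)
                score = score_fn(model.log_likelihood(data), model.n_free_params, len(data))
                if score > best_score:
                    best, best_score = model, score
            return best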

    A Hybrid Stochastic Connectionist Approach to Automatic Speech Recognition

    No full text
    This report focuses on a hybrid approach, combining stochastic and connectionist methods, for continuous speech recognition. Hidden Markov Models (HMMs) are a popular stochastic approach for continuous speech, well suited to coping with the high variability found in natural utterances. On the other hand, artificial neural networks (NNs) have shown high classification power for short speech utterances. We have therefore built a hybrid system that combines the advantages of both Hidden Markov Models and Neural Networks. The basic idea is as follows: build a codebook from the Time-Delay Neural Network (TDNN) output units and train HMMs using the Fuzzy-VQ algorithm. We trained several discrete HMMs for the task of recognizing Japanese phonemes using just one TDNN-generated codebook. We achieved a recognition rate of 96.1%, and in doing so increased the recognition rate of the discrete HMMs by 7.1%. The results are clear evidence of the possible collaboration of two different systems aimed…
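    A minimal sketch of the codebook step described above, assuming the TDNN has already produced a matrix of output-layer activations (one row per frame). Plain k-means stands in for the Fuzzy-VQ algorithm used in the report, and the symbol sequence returned by quantize() is what a discrete HMM would be trained on; all names and sizes are illustrative.

        import numpy as np

        def build_codebook(tdnn_outputs, codebook_size=256, n_iter=20, seed=0):
            """Cluster TDNN output vectors into codewords with plain Lloyd k-means."""
            rng = np.random.default_rng(seed)
            centroids = tdnn_outputs[rng.choice(len(tdnn_outputs), codebook_size, replace=False)].astype(float)
            for _ in range(n_iter):
                # Assign every frame to its nearest codeword ...
                dists = np.linalg.norm(tdnn_outputs[:, None, :] - centroids[None, :, :], axis=2)
                labels = dists.argmin(axis=1)
                # ... and move each codeword to the mean of its members.
                for k in range(codebook_size):
                    members = tdnn_outputs[labels == k]
                    if len(members):
                        centroids[k] = members.mean(axis=0)
            return centroids

        def quantize(frames, centroids):
            """Map each frame to the index of its nearest codeword (the discrete HMM symbol)."""
            dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=2)
            return dists.argmin(axis=1)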

    Maximization of mutual information for offline Thai handwriting recognition

    No full text
    This paper aims to improve the performance of an HMM-based offline Thai handwriting recognition system through discriminative training and the use of fine-tuned feature extraction methods. The discriminative training is implemented by maximizing the mutual information between the data and their classes. The feature extraction is based on our proposed block-based PCA and composite images, which are shown to be better at discriminating confusable Thai characters. We demonstrate significant improvements in recognition accuracy compared to classifiers that are not discriminatively optimized.
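    As a hedged sketch of the block-based PCA idea named above: the character image is cut into fixed-size blocks and each block position gets its own PCA projection, so local stroke differences between confusable characters are kept separate rather than averaged into one global projection. Block size, component count, and function names are illustrative assumptions, not the paper's settings.

        import numpy as np

        def split_blocks(image, block=8):
            """Cut the image into non-overlapping block x block patches, flattened to vectors."""
            h, w = image.shape
            return [image[r:r + block, c:c + block].ravel()
                    for r in range(0, h - block + 1, block)
                    for c in range(0, w - block + 1, block)]

        def fit_block_pca(images, block=8, n_components=4):
            """Fit one PCA basis (mean and principal directions) per block position."""
            bases = []
            for patches in zip(*[split_blocks(img, block) for img in images]):
                X = np.stack(patches).astype(float)
                mu = X.mean(axis=0)
                # Principal directions from the SVD of the centred patch matrix.
                _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
                bases.append((mu, vt[:n_components]))
            return bases

        def extract_features(image, bases, block=8):
            """Project every block on its own basis and concatenate the coefficients."""
            patches = split_blocks(image, block)
            return np.concatenate([comp @ (patch - mu) for (mu, comp), patch in zip(bases, patches)])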

    A Discriminative Filter Bank Model For Speech Recognition

    No full text
    This paper investigates the realization of a filter bank model that achieves minimum classification error. A bank-of-filters feature extraction module is jointly optimized with the classifier's parameters so as to minimize the errors occurring at the back-end classifier, within the framework of the Minimum Classification Error/Generalized Probabilistic Descent (MCE/GPD) method. The method was first applied to readjusting various parameters of filter banks linearly spaced on the Mel scale for a Japanese vowel recognition task; analysis of the feature extraction process shows how the parts of the spectrum that are relevant to discrimination are captured. The method was then applied to a multi-speaker word recognition system, resulting in a word error rate reduction of more than 20%. It has been argued that the interaction between signal representation and classification strongly influences speech recognition performance [2], [6]. However, the underlying nature of this…
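    The MCE/GPD machinery named above has a standard core, sketched below in plain Python: the per-class discriminant scores for one training token are turned into a misclassification measure, passed through a sigmoid loss, and the trainable parameters (here the filter-bank and classifier parameters, left abstract) are moved down its gradient. The smoothing constants psi and alpha and the step size eta are illustrative, not the paper's values.

        import numpy as np

        def misclassification_measure(g, true_class, psi=2.0):
            """d = -g_true + soft-max (log-mean-exp) of the competing class scores."""
            rivals = np.delete(g, true_class)
            return -g[true_class] + np.log(np.mean(np.exp(psi * rivals))) / psi

        def mce_loss(d, alpha=1.0):
            """Smoothed 0-1 loss: sigmoid of the misclassification measure."""
            return 1.0 / (1.0 + np.exp(-alpha * d))

        def gpd_step(params, grad_of_loss, eta=0.01):
            """One probabilistic-descent update of the trainable parameters."""
            return params - eta * grad_of_loss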

    Discriminative Feature Extraction For Speech Recognition

    No full text
    Pattern recognition consists of feature extraction and classification over the extracted features. Usually these two processes are designed separately, which means that the resulting recognizer is not necessarily optimal in terms of classification accuracy. To overcome this gap in recognizer design, we introduce in this paper a new design concept, named Discriminative Feature Extraction (DFE). DFE is based on a recent discriminative learning theory, the Minimum Classification Error formalization/Generalized Probabilistic Descent method, and provides an innovative way to design the entire recognition process: a front-end feature extractor as well as a post-end classifier is consistently optimized under a single criterion of minimizing classification errors. The concept is quite general and can be applied to a wide range of pattern recognition tasks. This paper is devoted to the application of DFE to speech recognition. Experiments on a Japanese vowel recognition task show the advantages of…
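    What distinguishes DFE from the conventional pipeline is that one loss drives both stages. A hedged sketch of a single training step is given below: the gradient of the same smoothed classification-error loss (for instance the MCE loss sketched for the previous entry) is applied to the feature-extractor parameters and to the classifier parameters together. The gradient callables and every name here are assumptions for illustration, not the paper's notation.

        def dfe_step(theta_feat, theta_cls, token, label, grad_feat, grad_cls, eta=0.01):
            """Single-token descent update of the extractor and the classifier under one loss."""
            g_feat = grad_feat(theta_feat, theta_cls, token, label)
            g_cls = grad_cls(theta_feat, theta_cls, token, label)
            return theta_feat - eta * g_feat, theta_cls - eta * g_cls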